Session Title: Multiple Linear Regression with PyTorch


Objective : In this session, we will:

  1. Build a multiple linear regression model using PyTorch.

  2. Use sample data generated programmatically.

  3. Explain each line of code in detail.

  4. Understand the real-world relevance of multiple linear regression.


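1. What is Multiple Linear Regression?

Theory : Multiple linear regression models a target variable as a weighted sum of several input features plus a bias term: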

\[ y = w_1x_1 + w_2x_2 + \ldots + w_nx_n + b \]

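where \(y\) is the target variable, \(x_1, \ldots, x_n\) are the input features, \(w_1, \ldots, w_n\) are the weights learned for each feature, and \(b\) is the bias (intercept).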
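
As a quick numeric check of the equation above, the following minimal sketch evaluates it for a single sample. The feature values, weights, and bias are hypothetical numbers chosen only for illustration.

```python
import torch

# Hypothetical example values: three features, three weights, and a bias.
x = torch.tensor([1.0, 2.0, 3.0])   # features x_1, x_2, x_3
w = torch.tensor([3.0, 2.0, -1.0])  # weights  w_1, w_2, w_3
b = 5.0                             # bias

# y = w_1*x_1 + w_2*x_2 + w_3*x_3 + b
y = torch.dot(w, x) + b
print(y.item())  # 3*1 + 2*2 - 1*3 + 5 = 9.0
```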


2. Sample Data Generation

We'll create a dataset with three features (\(x_1, x_2, x_3\)) and a target (\(y\)) using the formula:

\[ y = 3x_1 + 2x_2 - x_3 + 5 + \text{noise} \]

import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt

# Hyper-parameters
input_size = 3  # Three independent variables
output_size = 1  # One target variable
num_epochs = 100
learning_rate = 0.01

# Generate sample data
np.random.seed(42)
x_train = np.random.rand(100, 3) * 10  # 100 samples, 3 features
weights = np.array([3.0, 2.0, -1.0])  # True weights
bias = 5.0  # True bias
noise = np.random.randn(100, 1)  # Add some noise

# Calculate y using the linear relationship
y_train = np.dot(x_train, weights.reshape(-1, 1)) + bias + noise

Code Explanation :

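  1. Input Data (\(x\)) : np.random.rand(100, 3) * 10 draws 100 samples with 3 features each, uniformly distributed in [0, 10).

  2. True Weights and Bias : The weights [3.0, 2.0, -1.0] and the bias 5.0 define the underlying relationship that the model should recover.

  3. Noise : np.random.randn(100, 1) draws one standard normal value (mean 0, standard deviation 1) per sample, so the data is not perfectly linear.

  4. Target Variable (\(y\)) : np.dot(x_train, weights.reshape(-1, 1)) + bias + noise applies the linear relationship and yields an array of shape (100, 1).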
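
One way to sanity-check the generated data, before any PyTorch training, is an ordinary least-squares fit in NumPy. The sketch below is an optional check and not part of the session code; the names design, coeffs, est_weights, and est_bias are introduced here for illustration. The estimates should land near the true weights and bias, up to the effect of the noise.

```python
import numpy as np

# Regenerate the data exactly as in the session code.
np.random.seed(42)
x_train = np.random.rand(100, 3) * 10
weights = np.array([3.0, 2.0, -1.0])
bias = 5.0
noise = np.random.randn(100, 1)
y_train = np.dot(x_train, weights.reshape(-1, 1)) + bias + noise

print(x_train.shape, y_train.shape)  # (100, 3) (100, 1)

# Append a column of ones so the last coefficient plays the role of the bias.
design = np.hstack([x_train, np.ones((100, 1))])
coeffs, *_ = np.linalg.lstsq(design, y_train, rcond=None)

est_weights = coeffs[:3].ravel()
est_bias = coeffs[3].item()
print("estimated weights:", est_weights)
print("estimated bias:", est_bias)
```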

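3. Building the Model

Model Definition : nn.Linear(input_size, output_size) is a single layer that holds one weight per input feature plus a bias, which matches the regression equation from section 1.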

# Define the linear regression model
model = nn.Linear(input_size, output_size)
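
To see what this layer contains, the following sketch inspects its parameters and reproduces its forward pass by hand. The seed and the random batch x are arbitrary choices made for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed so the random initial weights are repeatable
model = nn.Linear(3, 1)

print(model.weight.shape)  # torch.Size([1, 3]): one weight per feature
print(model.bias.shape)    # torch.Size([1]): a single bias

# A batch of 4 hypothetical samples with 3 features each.
x = torch.rand(4, 3)

# The layer's forward pass is the matrix form of y = w_1*x_1 + w_2*x_2 + w_3*x_3 + b.
manual = x @ model.weight.T + model.bias
print(torch.allclose(model(x), manual, atol=1e-6))
```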

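4. Define Loss Function and Optimizer

Loss Function : nn.MSELoss() computes the mean squared error between predictions and targets.

Optimizer : torch.optim.SGD updates the model parameters by gradient descent, using the given learning rate.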

criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
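
The sketch below spells out what these two objects compute, assuming the default settings used above (mean reduction for the loss, plain SGD without momentum or weight decay). The small random inputs and targets are hypothetical and serve only to illustrate the arithmetic.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(3, 1)
criterion = nn.MSELoss()
learning_rate = 0.01
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

# Hypothetical mini-dataset, used only to illustrate the calculation.
inputs = torch.rand(5, 3)
targets = torch.rand(5, 1)

outputs = model(inputs)
loss = criterion(outputs, targets)

# MSELoss with default settings is the mean of the squared errors.
manual_loss = ((outputs - targets) ** 2).mean()

optimizer.zero_grad()
loss.backward()

# Plain SGD update rule: new_param = param - lr * grad.
expected_weight = model.weight.detach() - learning_rate * model.weight.grad
optimizer.step()

print(torch.allclose(loss, manual_loss))
print(torch.allclose(model.weight.detach(), expected_weight))
```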

5. Training the Model

Training Loop :

# Convert numpy arrays to float32 torch tensors once, before the loop
inputs = torch.from_numpy(x_train).float()
targets = torch.from_numpy(y_train).float()

for epoch in range(num_epochs):
    # Forward pass: compute predictions and the loss
    outputs = model(inputs)
    loss = criterion(outputs, targets)
    
    # Backward pass: clear old gradients, compute new ones, update parameters
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    
    if (epoch + 1) % 10 == 0:
        print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")

Line-by-Line Explanation :

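  1. Convert Data to Tensors : torch.from_numpy(...).float() turns the NumPy arrays (float64) into float32 tensors, the default dtype of the nn.Linear parameters.

  2. Forward Pass : model(inputs) computes one prediction per sample, giving a tensor of shape (100, 1).

  3. Calculate Loss : criterion(outputs, targets) returns the mean squared error between predictions and targets as a scalar tensor.

  4. Backward Pass : optimizer.zero_grad() clears the gradients left over from the previous iteration, then loss.backward() computes the gradient of the loss with respect to the weights and bias.

  5. Update Weights : optimizer.step() adjusts the weights and bias in the direction that reduces the loss, scaled by the learning rate.

  6. Monitor Training : Every 10 epochs, the current epoch and loss are printed so that progress can be followed.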
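
The reason for clearing gradients before each backward pass is that PyTorch adds new gradients to the existing ones rather than overwriting them. The following sketch demonstrates this on a small hypothetical batch; the names grad_once and grad_twice are introduced here for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(3, 1)
criterion = nn.MSELoss()

# Hypothetical batch, used only for this demonstration.
inputs = torch.rand(5, 3)
targets = torch.rand(5, 1)

# First backward pass.
criterion(model(inputs), targets).backward()
grad_once = model.weight.grad.clone()

# Second backward pass without clearing: the new gradient is added to the old one.
criterion(model(inputs), targets).backward()
grad_twice = model.weight.grad.clone()
print(torch.allclose(grad_twice, 2 * grad_once))

# After clearing, a backward pass gives the single-pass gradient again.
model.zero_grad()
criterion(model(inputs), targets).backward()
print(torch.allclose(model.weight.grad, grad_once))
```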

6. Visualizing Results

Plot the True vs. Predicted Values

# Evaluate the model
model.eval()
predicted = model(torch.from_numpy(x_train).float()).detach().numpy()

# Visualization
plt.figure(figsize=(10, 6))
plt.scatter(range(len(y_train)), y_train, color='blue', label='True Values')
plt.scatter(range(len(predicted)), predicted, color='red', label='Predicted Values')
plt.title('True vs Predicted Values')
plt.xlabel('Sample Index')
plt.ylabel('Target Value')
plt.legend()
plt.show()

Explanation :

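  1. model.eval(): Switches the model to evaluation mode, which changes the behavior of layers such as dropout and batch normalization. It does not disable gradient tracking, and it does not change the output of a plain nn.Linear layer; calling it before inference is simply a good habit.

  2. detach(): Returns a tensor that is cut off from the computation graph. This is required here because numpy() cannot be called on a tensor that requires gradients.

  3. Plot : Blue points are the true target values and red points are the model's predictions, both plotted against the sample index. The closer each red point lies to its blue counterpart, the better the fit.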
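
A scatter plot gives a visual impression only. As a complement, the sketch below shows one possible way to run inference under torch.no_grad(), which is the mechanism that actually turns off gradient tracking, and to summarize the fit with MSE and R². The r2_score helper and the stand-in model and data are hypothetical; in the session they would be replaced by the trained model, x_train, and y_train.

```python
import numpy as np
import torch
import torch.nn as nn

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (hypothetical helper)."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

torch.manual_seed(0)
model = nn.Linear(3, 1)             # stand-in for the trained model
x = torch.rand(10, 3)               # stand-in inputs
y_true = torch.rand(10, 1).numpy()  # stand-in targets

model.eval()
with torch.no_grad():  # no computation graph is built inside this block
    predicted = model(x)

print(predicted.requires_grad)  # False
y_pred = predicted.numpy()      # detach() is not needed under no_grad()
print("MSE:", np.mean((y_true - y_pred) ** 2))
print("R^2:", r2_score(y_true, y_pred))
```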


7. Saving the Model

torch.save(model.state_dict(), 'multiple_linear_regression.ckpt')
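
Since state_dict() stores only the parameters, loading them later requires building the same architecture first. The following sketch shows a possible round trip; it writes to a temporary directory, and the untrained model here is a stand-in for the trained one.

```python
import os
import tempfile

import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(3, 1)  # stand-in for the trained model

# A temporary directory is used so that the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'multiple_linear_regression.ckpt')
    torch.save(model.state_dict(), path)

    # Rebuild the same architecture, then load the saved parameters into it.
    restored = nn.Linear(3, 1)
    restored.load_state_dict(torch.load(path))
    restored.eval()

print(torch.equal(restored.weight, model.weight))
print(torch.equal(restored.bias, model.bias))
```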

Complete Code

import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt

# Hyper-parameters
input_size = 3
output_size = 1
num_epochs = 100
learning_rate = 0.01

# Generate sample data
np.random.seed(42)
x_train = np.random.rand(100, 3) * 10
weights = np.array([3.0, 2.0, -1.0])
bias = 5.0
noise = np.random.randn(100, 1)
y_train = np.dot(x_train, weights.reshape(-1, 1)) + bias + noise

# Linear regression model
model = nn.Linear(input_size, output_size)

# Loss and optimizer
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

# Convert numpy arrays to torch tensors once, before the loop
inputs = torch.from_numpy(x_train).float()
targets = torch.from_numpy(y_train).float()

# Train the model
for epoch in range(num_epochs):
    
    outputs = model(inputs)
    loss = criterion(outputs, targets)
    
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    
    if (epoch + 1) % 10 == 0:
        print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")

# Visualize results
model.eval()
predicted = model(torch.from_numpy(x_train).float()).detach().numpy()
plt.figure(figsize=(10, 6))
plt.scatter(range(len(y_train)), y_train, color='blue', label='True Values')
plt.scatter(range(len(predicted)), predicted, color='red', label='Predicted Values')
plt.title('True vs Predicted Values')
plt.xlabel('Sample Index')
plt.ylabel('Target Value')
plt.legend()
plt.show()

# Save the model
torch.save(model.state_dict(), 'multiple_linear_regression.ckpt')

Learning Outcomes

  1. Understand multiple linear regression and how to implement it in PyTorch.

  2. Learn to generate sample data for experimentation.

  3. Visualize true vs. predicted values to evaluate model performance.
